DETAILED ACTION

1. In view of the Appeal Brief filed on 04/17/2008, PROSECUTION IS HEREBY
REOPENED. New grounds of rejection are set forth below.

To avoid abandonment of the application, appellant must exercise one of the 
following two options: 

(1) file a reply under 37 CFR 1.111 (if this Office action is non-final) or a reply
under 37 CFR 1.113 (if this Office action is final); or, 

(2) initiate a new appeal by filing a notice of appeal under 37 CFR 41.31 followed
by an appeal brief under 37 CFR 41.37. The previously paid notice of appeal fee and
appeal brief fee can be applied to the new appeal. If, however, the appeal fees set forth
in 37 CFR 41.20 have been increased since they were previously paid, then appellant
must pay the difference between the increased fees and the amount previously paid. 

A Supervisory Patent Examiner (SPE) has approved of reopening prosecution by 
signing below: 

/Patrick N. Edouard/ 

Supervisory Patent Examiner, Art Unit 2626 

2. This communication is in response to the Appeal Brief filed on 04/17/2008. 
Claims 1, 3, 4, 6, 7, and 14-25 are pending and have been examined. The Applicants'
amendment and remarks have been carefully considered, but they do not place the
claims in condition for allowance.

3. All previous objections and rejections directed to the Applicant's disclosure and
claims not discussed in this Office Action have been withdrawn by the Examiner. 



Response to Arguments 

4. Applicant's arguments in the Appeal Brief (pages 10-19) filed on 04/17/2008 with
regard to claims 1, 3, 4, 6, 7, and 21-25 have been fully considered but they are not
persuasive (the arguments as stated in the Final Office Action appear below).
However, upon further consideration, the secondary reference of Hon (903) was 
withdrawn regarding claims 14-20. 

As to the arguments regarding claim 1, the Applicants argue that Nassiff does not
teach the limitation of "modify a probability associated with an existing pronunciation" 
since the language model is updated and not the acoustic model. In response to 
applicant's argument that the references fail to show certain features of applicant's 
invention, it is noted that the features upon which applicant relies (i.e., updating an 
acoustic model) are not recited in the rejected claim(s). Although the claims are 
interpreted in light of the specification, limitations from the specification are not read into 
the claims. See In re Van Geuns, 988 F.2d 1181, 26 USPQ2d 1057 (Fed. Cir. 1993). 
Furthermore, the cited reference does not teach away from the current limitation.
Further, the following passages in Nassiff (col. 6, lines 64-65 and col. 7, lines
43-61) are cited to show the updating of the language model and the relevant statistical
scores (e.g. probability). Furthermore, the word patterns as disclosed in Nassiff are a
representation of word sequences (see col. 6, lines 60-66) that consist of probabilities
associated with each other. A change in the sequence of words directly affects the
pronunciation, where the stated reference prevents future misrecognition by updating
the language model (see col. 6, lines 33-34).

As to the arguments regarding claim 7, the Applicants argue that Nassiff does not
teach the limitation of "inferring whether the change is a correction, or editing includes
comparing a speech recognition score of the dictated text ..." The examiner traverses
this argument by citing col. 5, lines 32-48 and col. 7, lines 49-63. The system makes a
determination or inference as to whether a correction or edit has been made. If this has
been done, then the system knows an error has occurred. The latter citation shows a
comparison between the misrecognized word and the recognized word. A close match using a
statistical measure is compared and, if within a threshold, the language model is updated
or learned.

As to the arguments regarding claim 9, the Applicants argue that Nassiff in view
of Gould does not teach "measuring the amount of time between dictation and the
change". The examiner traverses these arguments by again citing the passages in
Gould on page 5, lines 56-59 and on page 7, lines 13-19. An inference is made by first
allowing the user to correct an error within a predetermined time, which in this case
is the last three utterances. The system makes an inference by detecting this edit and
updating speech models and hence meets the limitation as cited in claim 9.

As to the arguments regarding claim 20, the Applicants argue that Nassiff in view
of Hon (801) in view of Hon (903) does not teach "determining whether
the new pronunciation has occurred a pre-selected number of times." In response to
applicant's arguments against the references individually, one cannot show
nonobviousness by attacking references individually where the rejections are based on
combinations of references. See In re Keller, 642 F.2d 413, 208 USPQ 871 (CCPA
1981); In re Merck & Co., 800 F.2d 1091, 231 USPQ 375 (Fed. Cir. 1986). The Nassiff
reference states the updating or learning of the language model as discussed above in
claim 1. Furthermore, Hon (903) teaches that if the phoneme has occurred a selected
number of times then incorrect recognition has occurred (see col. 7, lines 17-23). The
Nassiff reference states the detection of misrecognition error. The use of the Hon et al.
reference presents a method to detect misrecognition errors based on frequency of
misrecognized words (see Hon col. 7, lines 17-23).

As to the arguments regarding claim 21, the Applicants argue that Nassiff in view
of Hon (801) does not teach the limitation "adding at least one word pair to the user
lexicon." The examiner traverses this argument by citing again the Hon (801) reference
that presents a method of adding words to the lexicon. Furthermore, the Nassiff
reference discloses the recognition of two similarly recognizable words, specifically
"steep" and "step" as in col. 7, lines 43-60. A determination is made as to whether the
replacement word is on the list. If it is not, then a close match is found. Each word on the
replacement list represents a corresponding pair to another word that may be
misrecognized. Hence, the Hon (801) reference was used to teach the adding of a word
to a lexicon, which benefits the correction for speech recognition, as taught by Nassiff,
by updating a replacement word list. Hence the combination of references teaches the
above limitations.

As to claim 22, the Applicants argue that Nassiff in view of Hon (801) does not
teach the limitation "word pair is added to the lexicon temporarily." The examiner
traverses this argument by noting that the Nassiff reference discloses the recognition of
two similarly recognizable words, specifically "steep" and "step" as in col. 7, lines 43-60.
A determination is made as to whether the replacement word is on the list. If it is not,
then a close match is found. Each word on the replacement list represents a
corresponding pair to another word that may be misrecognized. Hence, the Hon (801)
reference was used to teach the adding of a word to a lexicon, which benefits the
correction for speech recognition, as taught by Nassiff, by updating a replacement word
list. Furthermore, the Nassiff et al. reference identified a problem, namely, whether
misrecognition has taken place due to an error or a user edit. A check against a
replacement word list is done. The ability to store a word temporarily is relative, where
the word's score is increased to recognize the word in a future instance (see Hon (801)
col. 9, lines 11-27). This new score associated with the word allows the word in the
lexicon to be different since adaptation to the acoustic models has been performed.

Claim Objections 

5. Claim 15 is objected to because of the following informalities: Claim 15 should 
be dependent upon claim 14, where "the wave" is first described. Appropriate 
correction is required.

Claim Rejections - 35 USC § 112

6. The following is a quotation of the second paragraph of 35 U.S.C. 112:

The specification shall conclude with one or more claims particularly pointing out and distinctly 
claiming the subject matter which the applicant regards as his invention. 

7. Claims 14-20 are rejected under 35 U.S.C. 112, second paragraph, as being 
indefinite for failing to particularly point out and distinctly claim the subject matter which 
applicant regards as the invention. It is unclear from claim 14, what result occurs if a 
context word does not exist. The current claim limitation denotes the context word 
existing if such word exists. However, it is unknown what occurs when the word does 
not exist. However, for compact prosecution the limitation was interpreted to mean 
surrounding text either appearing or not appearing, hence possibly including a single 
word alignment. 

8. Claims 15-20 are rejected as dependent upon an indefinite base claim. 

9. Claims 17-20 recite the limitations "newly identified pronunciation" and "the new
pronunciation" in line 2. There is insufficient antecedent basis for these limitations in
the claims. It is unclear from the claims which result the limitations "newly identified
pronunciation" and "the new pronunciation" refer to, as they lack antecedent basis.
For the purposes of compact prosecution, the limitations were interpreted to refer to the
corrected pronunciation.

Claim Rejections - 35 USC § 102 

10. The following is a quotation of the appropriate paragraphs of 35 U.S.C. 102 that
form the basis for the rejections under this section made in this Office action:

A person shall be entitled to a patent unless -

(b) the invention was patented or described in a printed publication in this or a foreign country or in public 
use or on sale in this country, more than one year prior to the date of application for patent in the United 
States. 

11. Claims 1, 4, and 7 are rejected under 35 U.S.C. 102(b) as being anticipated by
Nassiff et al. (US 6,418,410).

As to claim 1, Nassiff et al. teaches

a computer-implemented speech recognition system comprising: 

a microphone to receive user speech (see col. 4, lines 16-18); 
a speech recognition engine coupled to the microphone (see col. 4, lines 
16-17) (e.g. The speech recognition engine receives input from the microphone 
so it is implied that the two are coupled.), and being adapted to recognize the 
user speech (see col. 4, lines 15-19) and provide a textual output on a user 
interface (see col. 2, lines 19-20 and col. 5, lines 32-38); and

wherein the recognition engine is adapted to determine if the user's 
pronunciation caused the error, and selectively modify a probability associated 
with an existing pronunciation (see col. 7, lines 55-66) (e.g. The use of a 
statistical quantity with the updating of a language model implies that a 
probability value is associated with a word when comparisons are made (see col. 
6, lines 28-31)). 

As to claim 4, Nassiff et al. teaches wherein the recognition engine is adapted to 
determine if the user's pronunciation caused the error and selectively learn the new 
pronunciation (see col. 6, lines 45-50 and lines 57-58) (e.g. The determination is made
as to whether a misrecognition error has occurred; if so, the language model is
updated.).



As to claim 7, Nassiff et al. teaches a method of learning with an automatic 
speech recognition system, the method comprising: 

detecting a change to dictated text (see col. 5, lines 33-40, based on 
deletion or typing over the words); 

inferring whether the change is a correction, or editing (see col. 5, lines
33-48, correction or editing is determined based on deletion (editing) or typing
over the words (correction)); and

wherein inferring whether the change is a correction includes comparing
a speech recognition engine score (see col. 6, lines 28-31) of the dictated text
and of the changed text (see col. 7, lines 50-62);

if the change is inferred to be a correction, selectively learning from the
nature of the correction without additional user interaction (see col. 6, lines
45-50);

wherein selectively learning from the nature of correction includes 
determining if the corrected word exists in the user lexicon, selectively learning 
the pronunciation (see col. 6, lines 45-50, the language model is updated when 
the replacement word is found on the alternative word list.)

Claim Rejections - 35 USC § 103

12. The following is a quotation of 35 U.S.C. 103(a) which forms the basis for all
obviousness rejections set forth in this Office action: 

(a) A patent may not be obtained though the invention is not identically disclosed or described as set 
forth in section 102 of this title, if the differences between the subject matter sought to be patented and 
the prior art are such that the subject matter as a whole would have been obvious at the time the 
invention was made to a person having ordinary skill in the art to which said subject matter pertains. 
Patentability shall not be negatived by the manner in which the invention was made. 

13. Claims 3, 6, and 21 are rejected under 35 U.S.C. 103(a) as being unpatentable
over Nassiff et al. in view of Hon et al. (US 5,852,801).

As to claims 3 and 21, Nassiff et al. teaches the use of a user lexicon (see col. 6,
line 25 and col. 6, line 28) (e.g. the alternative word list).

However, Nassiff et al. does not specifically teach the user updating of 
new words in the lexicon. 

Hon et al. (801) does teach the use of a lexicon, which is updated for new
words (see col. 9, lines 36-40), where words are added when determining if the
words exist in the user lexicon (see col. 7, lines 66-67 and col. 8, lines 1-3) (e.g.
The determination is made of whether the word is in the lexicon if it is
unrecognized), (e.g. Since the language model is updated, the temporary storing
of words in Nassiff based on presence or absence in the user lexicon would be
obvious to one skilled in the art. Further, it was stated that the words "two
much" and "too much" are added to the lexicon, where the words two and too are
a word pair. Hence, Nassiff teaches a similar word pair being step and steep.
The misrecognition of step to be steep would be a word pair when added to the
list of words as taught by Hon.)

It would have been obvious to one of ordinary skill in the art at the time
the invention was made to have modified the correction of dictating speech of
Nassiff et al. with the inclusion of an updating lexicon for adding corrected or new
words as taught by Hon (801). The motivation to have combined the references
involves the reduction of errors when spoken words are not found in the lexicon
of the recognition engine so as to adapt to unrecognized words in a speech
recognition system (see Hon (801) col. 1, lines 33-36 and lines 54-56).

As to claim 6, Nassiff et al. teaches the updating of the user lexicon not based on 
new words or new pronunciation (see col. 6, lines 45-50) (e.g. Since the updating of the 
language models is performed, the extraction of the specific word will be retrieved and 
hence is an alternate form of a word in the alternate list as indicated by the reference 
(e.g. The example given is "steep" and "step")). 

However, Nassiff et al. does not specifically teach the user adding of new 
words in the lexicon. 

Hon et al. (801) does teach the use of a lexicon, which is updated for new
words (see col. 9, lines 36-40), where words are added (see below) when 
determining if the words exist in the user lexicon (see col. 7, lines 66-67 and col. 
8, lines 1-3) (e.g. The determination is made of whether the word is in the lexicon 
if it is unrecognized). 

It would have been obvious to one of ordinary skill in the art at the time
the invention was made to have modified the correction of dictating speech of
Nassiff et al. with the inclusion of an updating lexicon for adding corrected or new
words as taught by Hon (801). The motivation to have combined the references
involves the reduction of errors when spoken words are not found in the lexicon 
of the recognition engine so as to adapt to unrecognized words in a speech 
recognition system (see Hon (801) col. 1, lines 33-36 and lines 54-56). 

14. Claims 14-19 are rejected under 35 U.S.C. 103(a) as being unpatentable over
Nassiff et al. in view of Hon et al. (US 5,852,801) as applied to claim 1 3 above, and
further in view of Lewis et al. (US 6,138,099).

As to claim 14, Nassiff et al. and Hon et al. (US 5,852,801) do not teach the
forced alignment of the wave based on a context word.

Lewis et al. does teach doing a forced alignment (see Figure 2, step 40,
comparison of original audio and baseform of replacement text) of a wave (see
Figure 2, step 36, wave is the text or audio) based on at least one context word if 
such a word exists (see Figure 2, step 36, 38, and 40, the input text of a speech 
session is used for comparison and determination of replacement text is made). 

It would have been obvious to one of ordinary skill in the art at the time
the invention was made to have modified the correction of dictated speech
presented by Nassiff et al. and Hon et al. (US 5,852,801) with the inclusion of
alignment between two words as taught by Lewis. The motivation to have
combined the references involves updating language models during speech
misrecognition without user interaction (see Lewis, col. 1, lines 25-31) as would
benefit the speech recognition system presented by Nassiff et al. to enhance
phonetic and pronunciation recognition.

As to claim 15, Lewis et al. teaches wherein determining if the user's 
pronunciation deviated from existing pronunciations includes identifying in the wave the 
pronunciation (see step 40 and step 42, where the original and corrected baseform are 
compared to see if deviation exists). 

As to claim 16, Hon et al. (US 5,963,903) teaches wherein building a lattice
based upon possible pronunciations of the corrected word and the recognition result
(see Figure 2, step 50, where the baseform of the replacement text is generated if it
does not exist and is compared in step 40) (e.g. Hence, it is obvious that the original
audio/text also has a baseform representation in order for comparing the two alignments).

As to claims 17-19, Nassiff et al. teaches wherein generating a confidence score 
based at least in part upon the distance of the newly identified pronunciation with 
existing pronunciations (see Figure 2, steps 40 and 42, where the two baseforms of the 
original and replaced text are compared to determine whether an acoustic match 
occurs. As to claim 18, a close acoustic match between the two texts is determined
based on some type of scoring. As to claim 19, in order to propagate from step 42 to 41
or 42, a threshold is needed).

15. Claim 20 is rejected under 35 U.S.C. 103(a) as being unpatentable over Nassiff
et al. in view of Hon et al. (US 5,852,801) in view of Lewis as applied to claim 19 above,
and further in view of Beaufays et al. (US 7,280,963).

As to claim 20, Nassiff teaches learning from pronunciation errors (see claim 7,
above). 

However, Nassiff in view of Hon in view of Lewis do not specifically teach 
the new pronunciation occurring a predetermined number of times. 

Beaufays et al. does teach learning the pronunciation based on whether the
new pronunciation has occurred a pre-selected number of times (see col. 5, lines 
1-10, words below a threshold are removed from transcribed acoustic data in 
order to prevent pronunciation learning of incorrect words. Hence, it can be 
inferred that words above the threshold are retained.). 

It would have been obvious to one of ordinary skill in the art at the time
the invention was made to have modified the correction of dictated speech
presented by Nassiff et al. and Hon et al. (US 5,852,801) with the pronunciation
occurring a pre-selected number of times as taught by Beaufays for the purpose 
of preventing the learning of incorrect words (see col. 5, lines 1-10). 

16. Claims 22 and 23 are rejected under 35 U.S.C. 103(a) as being unpatentable
over Nassiff et al. in view of Hon et al. (801) as applied to claim 22 above, and further in
view of Hoffmann et al. (US 2003/0139922).

As to claims 22 and 23, Nassiff et al. in view of Hon et al. teach all of the
limitations as in claim 22, above.

Furthermore, Nassiff et al. teaches the recognition of two similarly
recognizable words (word pair), specifically "steep" and "step" as in col. 7, lines
43-60 (e.g. A determination is made as to whether the replacement word is on
the list. If it is not, then a close match is found. Each word on the replacement list
represents a corresponding pair to another word that may be misrecognized.)

Furthermore, the Hon (801) reference was used to teach the adding of a
word to a lexicon (see col. 9, lines 36-40). 

However, Nassiff in view of Hon et al. do not specifically teach addition of 
a word pair temporarily based on the most recent time the word pair is observed 
and the relative frequency that the pair has been observed in the past. 

Hoffmann et al. teaches that the addition of a word to a lexicon (vocabulary) is
based at least partially upon the most recent time the word pair is observed (see
[0015], FIFO, where the words not used for a long time are omitted) and the
relative frequency that the pair has been observed in the past (see [0015] and
[0031], frequency of occurrence).

It would have been obvious to one of ordinary skill in the art at the time
the invention was made to have modified the speech recognition system as
taught by Nassiff et al. in view of Hon et al. with the updating of a vocabulary
depending on frequency and time as taught by Hoffmann et al. The motivation to
have combined the references involves continuous renewal of the vocabulary to
eliminate words not used often and those not used for a long time (see Hoffmann
et al., [0015]).

17. Claims 24 and 25 are rejected under 35 U.S.C. 103(a) as being unpatentable
over Nassiff et al. in view of Gould (EP 0773 532 A2).

As to claim 24, Nassiff et al. teaches a method of learning with an automatic
speech recognition system, the method comprising:

detecting a change to dictated text (see col. 5, lines 50-61, change is
detected by a typing over the dictated word or deletion.) 

inferring whether the change is a correction (see col., lines 60-61) based
at least partially upon the number of words changed (e.g. It is obvious from the
reference that the number of words is taken into consideration to find out which
words were changed (see col. 5, lines 58-61, where replacement words and
dictated words are one or more words). The deletion or typing over makes the
inferring obvious in order to determine which words were edited or corrected.);
and

if the change is inferred to be a correction, selectively learning from the 
nature of the correction (see col. 7, lines 43-61, language models are updated or 
learned from the correction of steep to step.) 

The Gould reference is applied to show the use of a specific number of
words that are to be corrected. Gould teaches determining the number of
words that are corrected (see page 7, lines 13-19) (e.g. The user can correct a
predetermined number of the user's last utterances which are determined to be
misrecognized. The limit used in the reference is three. Further, the detection of
a change is determined by saying a phrase or typing or through mouse selection.
The use of mouse selection allows the computer to realize which words need to
be corrected (see page 7, lines 20-27)).

It would have been obvious to one of ordinary skill in the art at the time
the invention was made to have modified the correction of dictated speech of
Nassiff et al. with the inclusion of determining the number of words as taught by
Gould. The motivation to have combined the references involves the editing of
misrecognized words and of words recognized correctly where the user changes his
or her mind, as would benefit the system presented by Nassiff et al. to allow correctly
recognized words to be changed as well as misrecognized words (see Gould page 5,
lines 56-58 and page 2, lines 22-29).



As to claim 25, Nassiff in view of Gould teach all of the limitations as in claim 24, above.

Furthermore, Gould teaches wherein if the change is inferred to be a
correction, requesting a user confirmation (see page 7, lines 20-36, correction
window pops up as well as spelling window) (e.g. Based on the system
determining that a change in the dictated text is found by a command or
selection, the change being a correction is verified by the use of a correction or
spelling window that pops up for the user to edit or correct the entry. This is the
confirmation as to whether a correction needs to be made.)



Conclusion 

18. The prior art made of record and not relied upon is considered pertinent to 
applicant's disclosure. 

Bahl et al. (US 6,377,921) is cited to disclose identifying mismatches between
actual and assumed pronunciations. Yegnanarayanan et al. (US 6,490,555) is cited to 
disclose a forced alignment of sequences. Qin et al. (US 6,513,005) is cited to disclose 
correcting error characters in speech recognition. Stevens (US 6,912,498) is cited to 
disclose error correction in speech recognition. 

Any inquiry concerning this communication or earlier communications from the 
examiner should be directed to PARAS SHAH whose telephone number is (571)270-1650.
The examiner can normally be reached on MON.-THURS. 7:00 a.m.-4:00 p.m.
EST. 

If attempts to reach the examiner by telephone are unsuccessful, the examiner's 
supervisor, Patrick Edouard can be reached on (571)272-7603. The fax phone number 
for the organization where this application or proceeding is assigned is 571-273-8300.

Information regarding the status of an application may be obtained from the
Patent Application Information Retrieval (PAIR) system. Status information for 
published applications may be obtained from either Private PAIR or Public PAIR. 
Status information for unpublished applications is available through Private PAIR only. 
For more information about the PAIR system, see http://pair-direct.uspto.gov. Should 
you have questions on access to the Private PAIR system, contact the Electronic 
Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a 
USPTO Customer Service Representative or access to the automated information 
system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000. 